Iterative Double Clustering for Unsupervised and Semi-Supervised Learning

نویسندگان

  • Ran El-Yaniv
  • Oren Souroujon
چکیده

We present a powerful meta-clustering technique called Iterative Double Clustering (IDC). The IDC method is a natural extension of the recent Double Clustering (DC) method of Slonim and Tishby that exhibited impressive performance on text categorization tasks [12]. Using synthetically generated data we empirically find that whenever the DC procedure is successful in recovering some of the structure hidden in the data, the extended IDC procedure can incrementally compute a significantly more accurate classification. IDC is especially advantageous when the data exhibits high attribute noise. Our simulation results also show the effectiveness of IDC in text categorization problems. Surprisingly, this unsupervised procedure can be competitive with a (supervised) SVM trained with a small training set. Finally, we propose a simple and natural extension of IDC for semi-supervised and transductive learning where we are given both labeled and unlabeled examples.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Hybrid Neural Network for Data Clustering

* Professor Sungyoung Lee is the corresponding author. Abstract Clustering plays an indispensable role for data analysis. Many clustering algorithms have been developed. However, most of them suffer either poor performance of unsupervised learning or lacking of mechanisms to utilize some prior knowledge about data (semi-supervised learning) for improving clustering result. In an effort to archi...

متن کامل

Combining Classifier Guided by Semi-Supervision

The article suggests an algorithm for regular classifier ensemble methodology. The proposed methodology is based on possibilistic aggregation to classify samples. The argued method optimizes an objective function that combines environment recognition, multi-criteria aggregation term and a learning term. The optimization aims at learning backgrounds as solid clusters in subspaces of the high...

متن کامل

Combining Classifier Guided by Semi-Supervision

The article suggests an algorithm for regular classifier ensemble methodology. The proposed methodology is based on possibilistic aggregation to classify samples. The argued method optimizes an objective function that combines environment recognition, multi-criteria aggregation term and a learning term. The optimization aims at learning backgrounds as solid clusters in subspaces of the high...

متن کامل

An Artificial Life Approach for Semi-supervised Learning

An approach for the integration of supervising information into unsupervised clustering is presented (semi supervised learning). The underlying unsupervised clustering algorithm is based on swarm technologies from the field of Artificial Life systems. Its basic elements are autonomous agents called Databots. Their unsupervised movement patterns correspond to structural features of a high dimens...

متن کامل

INTEGRATED ADAPTIVE FUZZY CLUSTERING (IAFC) NEURAL NETWORKS USING FUZZY LEARNING RULES

The proposed IAFC neural networks have both stability and plasticity because theyuse a control structure similar to that of the ART-1(Adaptive Resonance Theory) neural network.The unsupervised IAFC neural network is the unsupervised neural network which uses the fuzzyleaky learning rule. This fuzzy leaky learning rule controls the updating amounts by fuzzymembership values. The supervised IAFC ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001